Use solver result caching in CI #1908
I am on vacation for the rest of the month, so unfortunately I won't be able to work on this again until September. Here's a summary of the status of the changes on this PR.
Non-solver cache related
Solver caching
I don't think this is the fault of this PR. GHC 9.4+ uses a different approach to recompilation checking (see this blog post) that allows most of the local Haskell dependencies to be reused directly from the cache rather than rebuilding. Indeed, if you compare this GHC 9.4 CI job versus this GHC 9.2 CI job, the main thing that accounts for the 9.2 job taking so much longer is the
This shouldn't be necessary, as caching is already limited to the current branch: https://github.com/actions/cache#cache-scopes
@m-yac: what are the indicators that we would be looking for to tell if this is working?
Oh, I just meant that the "Save SMT solver result cache" step of the s2n tests is failing. On the most recent run it failed with:
which looks like an easy-to-fix permissions issue. For some reason this doesn't give a red X, unlike previous times this step failed, which may be the source of the confusion.
If the table in the PR description is to be believed, then this work will end up saving a ton of time on CI rebuilds in the future. Great work, @m-yac!
(Out of curiosity, how did you obtain the figures in that table? Perhaps this info was dumped somewhere in the CI logs, but I couldn't easily find it.)
Thanks for the review, @RyanGlScott! With the exception of the number of cache hits, which came from debugging information that is now turned off, I got the numbers in the table directly from GitHub's web interface. Specifically, from the right-hand side when a job is selected. The specific CI runs I used are linked in the first column. For example, after going to this CI run from
FYI: Looks like GitHub evicted the previously built solver caches due to age, so the first attempt at running the CI on the latest commit had to rebuild all the caches from scratch (and so took longer than expected). I'm re-running the CI now that there are some caches to use, just to make sure everything still works. In the future, when this PR is merged, this hopefully won't be a problem because we should always have fresh caches from
Also note that I forgot all the tests run in parallel when I made my original analysis of the speedup. Taking into account that the
This PR makes three changes:
- `--clean-solver-cache` command-line option to `saw`, which provides a direct way to remove cache entries with out-of-date solver versions (this is run every time a cache is loaded on CI)
- `rme` directory as the version information for `rme` calls, instead of the commit hash of the entire repo (this ensures cached `rme` calls are actually used when CI is run on a later commit which does not modify the `rme` directory)

The result is an average 1.57x speedup of the relevant parts of the CI. Because all the tests run in parallel, the only one which affects the overall runtime of the CI is the `awslc` test. Including solver caching consistently saves 35-40 minutes on this test. For two particular runs, the full breakdown is below, with the most dramatic speedups in bold.

Note that there appears to be no correlation between the number of cache hits and the speedup, likely because removing a few very "difficult" solver queries results in much more speedup than removing lots of "easy" solver queries.
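To illustrate the two cache-related ideas described above (version information scoped to one directory, and removal of entries with out-of-date solver versions), here is a minimal Python sketch. It is not SAW's implementation: the cache layout, the function names `directory_version` and `prune_stale`, and the use of a content hash are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def directory_version(root: str) -> str:
    """Hash the relative paths and contents of every file under root.

    Hypothetical stand-in for version information derived from a single
    directory rather than from the commit hash of the whole repository.
    """
    digest = hashlib.sha256()
    base = Path(root)
    for path in sorted(p for p in base.rglob("*") if p.is_file()):
        digest.update(path.relative_to(base).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def prune_stale(cache: dict, current_versions: dict) -> dict:
    """Keep only entries whose recorded version matches the current one.

    cache maps a query key to a dict with "solver", "version" and "result"
    fields (an assumed layout). Entries for solvers that are absent from
    current_versions are dropped as well.
    """
    return {
        key: entry
        for key, entry in cache.items()
        if current_versions.get(entry["solver"]) == entry["version"]
    }
```

Under these assumptions, a change elsewhere in the repository leaves `directory_version("rme")` unchanged, so entries recorded under that version still match on a later commit, while `prune_stale` discards anything recorded against an older solver version.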